Section: Software and Platforms

Leopar

Participants : Bruno Guillaume [correspondent] , Guy Perrier, Tatiana Ekeinhor.

Software description

Leopar is a parser for natural languages based on the formalism of Interaction Grammars (IGs) [40]. It uses a parsing principle called “electrostatic parsing”, which consists in neutralizing opposite polarities. A positive polarity corresponds to an available linguistic feature and a negative one to an expected feature.

Parsing a sentence with an Interaction Grammar starts by selecting a lexical entry for each of its words. A lexical entry is an underspecified syntactic tree, in other words a tree description. All selected tree descriptions are then combined by partial superposition, guided by the aim of neutralizing polarities: two opposite polarities are neutralized by merging their support nodes. Parsing succeeds if the process ends with a minimal and neutral tree. As IGs are based on polarities and underspecified trees, Leopar uses specific, non-trivial data structures and algorithms.
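The neutralization step can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not Leopar's actual data structures: a node is reduced to a dictionary of features, each feature carries a single atomic value and one of four simplified polarities, and the tree structure around the nodes is ignored. The names `combine`, `merge_nodes` and `is_neutral` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Simplified polarity set (an assumption of this sketch):
#   "+" available feature, "-" expected feature,
#   "=" neutral feature,   "*" saturated feature (a "+" has met a "-").
Feature = Tuple[str, str]          # (value, polarity)
Node = Dict[str, Feature]          # feature name -> (value, polarity)


def combine(p: str, q: str) -> Optional[str]:
    """Return the polarity of a merged feature, or None if merging is forbidden."""
    if p == "=":
        return q
    if q == "=":
        return p
    if {p, q} == {"+", "-"}:
        return "*"
    return None  # e.g. "+" with "+", or anything with an already saturated feature


def merge_nodes(a: Node, b: Node) -> Optional[Node]:
    """Merge two support nodes; fail on a value clash or a forbidden polarity pair."""
    merged = dict(a)
    for name, (value, polarity) in b.items():
        if name not in merged:
            merged[name] = (value, polarity)
            continue
        other_value, other_polarity = merged[name]
        if other_value != value:
            return None
        new_polarity = combine(other_polarity, polarity)
        if new_polarity is None:
            return None
        merged[name] = (value, new_polarity)
    return merged


def is_neutral(node: Node) -> bool:
    """A node is neutral when no feature is still available or expected."""
    return all(polarity in ("=", "*") for _, polarity in node.values())


# A verb expects a noun phrase; a proper noun provides one.
expected = {"cat": ("np", "-"), "num": ("sg", "=")}
available = {"cat": ("np", "+"), "num": ("sg", "=")}
merged = merge_nodes(expected, available)
```

In this toy example the two `cat` polarities cancel, so `merged` is neutral, whereas merging two nodes that both provide an `np` is rejected.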

Leopar exploits the electrostatic principle intensively. In theory, parsing with IGs is NP-complete; the nondeterminism usually associated with NP-completeness is present at two levels: when a description for each word is selected from the lexicon, and when a choice of which nodes to merge is made. Polarities have shown their efficiency in pruning the search tree.
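As an illustration of how polarities can prune the first level of nondeterminism, the sketch below discards lexical selections whose polarities cannot cancel out: such a selection can never lead to a neutral tree, so no node merging needs to be attempted for it. This is a simplified illustration and not necessarily how Leopar implements its filtering. The toy lexicon, the entry names and the reduction of each entry to a bag of net charges are all assumptions of the sketch.

```python
from collections import Counter
from itertools import product

# Hypothetical toy lexicon: each entry is reduced to its net polarity charge
# per (feature, value) pair: +1 for each positive, -1 for each negative polarity.
LEXICON = {
    "John": [{"name": "John/np", "charges": {("cat", "np"): +1}}],
    "Mary": [{"name": "Mary/np", "charges": {("cat", "np"): +1}}],
    "sees": [
        {"name": "sees/intransitive", "charges": {("cat", "np"): -1}},
        {"name": "sees/transitive", "charges": {("cat", "np"): -2}},
    ],
}


def net_charges(selection):
    """Sum the charges of all entries in a lexical selection."""
    total = Counter()
    for entry in selection:
        total.update(entry["charges"])  # Counter.update adds counts, including negative ones
    return total


def is_balanced(selection):
    """Necessary (not sufficient) condition for a neutral tree: every charge sums to zero."""
    return all(charge == 0 for charge in net_charges(selection).values())


def candidate_selections(words):
    """Enumerate one entry per word and keep only the balanced selections."""
    choices = [LEXICON[word] for word in words]
    return [selection for selection in product(*choices) if is_balanced(selection)]


sentence = ["John", "sees", "Mary"]
kept = candidate_selections(sentence)
kept_names = [[entry["name"] for entry in selection] for selection in kept]
```

Here only the selection using the transitive entry survives. The check is a necessary condition only: a balanced selection may still fail during node merging, which is the second level of nondeterminism.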

Current state of the implementation

Leopar is presented and documented at http://leopar.loria.fr; an online demonstration page can be found at http://leopar.loria.fr/demo.

It is open-source (under the CeCILL license, http://www.cecill.info) and is developed on the Inria GForge platform (http://gforge.inria.fr/projects/semagramme/).

The main features of the current software are:

One of the difficulties with symbolic parsing is that several solutions can be produced for a single sentence, and we want to be able to rank them. Tatiana Ekeinhor, during her second-year Master's internship (from February to June 2013), implemented a ranker based on statistical techniques. Using the Sequoia Treebank as a training corpus, she obtained an improvement over the handcrafted rules.